Dashboard Demonstration



This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.

  • This was made with the free and open-source R programming language
  • Code-driven dashboards like this can be automatically generated to be custom for any given user, and updated with new inputs, analysis, and visualization as part of the full analysis pipeline
  • The code and data used to create this example can be found—and reused and contributed to—at this open-source GitHub repository
  • Note that all data here is fake, intended only for the purpose of illustration

Case-Level Visualization


System-Level Trajectory Visualization


  • Trajectory visualizations can show pathways of cases through institutional contacts
  • Each individual trajectory is shown as a (narrow) horizontal line. Each subplot shows hundreds of individual cases, with cases sorted by their starting state, then the type and duration of the state that follows.
  • These plots show the starting age, duration, sequence, and number of spells for different institutional contacts
  • The multiple figures show different distinct cluster patterns. In this example:
    • “Cluster 1” represents individuals with short involvements which start at a range of ages
    • “Cluster 2” represents individuals with long, unbroken stints in child welfare cases. A third of this group has out-of-home placements, almost always at early ages.
    • “Cluster 3”–the smallest group–represents individuals with long stints of out-of home placements, generally starting at early ages.
  • These institutional involvement cluster types can also be displayed in tables or maps

Macro-Planning


System-Level Topic Map


---
title: "ACDHS Dashboard Demo"
author: "Chapin Hall"
date: "August 6, 2018"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    logo: img/AGC-triangle-small.png
    css: flexdash.css
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warnings = FALSE, error = FALSE)

### Load and--if necessary--install packages ----------------------------------#
package.list <- c("flexdashboard", "data.table", "dplyr", "stringr", "rgdal",
                  "MASS", "plotly", "leaflet", "ggplot2", "scales", "DT")
for (p in package.list){
  if (!p %in% installed.packages()[, "Package"]) install.packages(p)
  library(p, character.only = TRUE)
}

### Load custom functions -----------------------------------------------------#
grepv <- function(p, x, ...) grep(p, x, value = TRUE, ...)
cn <- function(x) colnames(x)

```

### Dashboard Demonstration



This dashboard shows a handful of data visualization use cases for the Unstructured Data Analytics initiative of the Allegheny County Department of Public Health. This was developed by staff at Chapin Hall and its partners at Argonne National Laboratory.

  • This was made with the free and open-source [R programming language](https://www.r-project.org/about.html)
  • Code-driven dashboards like this can be automatically generated to be custom for any given user, and updated with new inputs, analysis, and visualization as part of the full analysis pipeline
  • The code and data used to create this example can be found—and reused and contributed to—at [this open-source GitHub repository](https://github.com/chapinhall/acdhs-demo)
  • **Note** that all data here is fake, intended only for the purpose of illustration
### Case-Level Visualization ***
  • This figure combines
    • text-mined sentiment score
    • text-mined key topics of relevance
    • indication of institutional contacts, including DHS investigation and intervention
  • Low sentiment scores could be used to trigger specific interventions or referrals to other practitioners, when combined with key topic combinations
  • Topic and sentiment scores can help supervisors quickly review timely characteristics of individual cases or a worker's entire caseload
  • Persistence of key topics can raise flags to QA teams to complete additional modules, e.g. related to housing concerns
### System-Level Trajectory Visualization {data-commentary-width=500} ```{r generate prevalence figure to match trajectory viz} df <- data.frame(clus = c("Cluster 1", "Cluster 2", "Cluster 3"), freq = c(5442, 679, 227)) %>% mutate(pct = freq/sum(freq), label = paste0("n = ", prettyNum(freq, big.mark = ","))) myplot <- ggplot(df, aes(x = clus, y = pct, fill = clus)) + geom_bar(stat = "identity", position = "dodge") + geom_text(aes(label = label), vjust = 1) + scale_y_continuous(labels = percent) + labs(title = "Prevalence of Cluster Types in Population", x = "", y = "% of Population") + theme(legend.position = "none", axis.title = element_text(size = 12), axis.text = element_text(size = 11)) ggsave("img/dually-involved-youth_traj-viz_3-cluster--prevalence.png") ``` ***
  • Trajectory visualizations can show pathways of cases through institutional contacts
  • Each individual trajectory is shown as a (narrow) horizontal line. Each subplot shows hundreds of individual cases, with cases sorted by their starting state, then the type and duration of the state that follows.
  • These plots show the starting age, duration, sequence, and number of spells for different institutional contacts
  • The multiple figures show different distinct cluster patterns. In this example:
    • "Cluster 1" represents individuals with short involvements which start at a range of ages
    • "Cluster 2" represents individuals with long, unbroken stints in child welfare cases. A third of this group has out-of-home placements, almost always at early ages.
    • "Cluster 3"--the smallest group--represents individuals with long stints of out-of home placements, generally starting at early ages.
  • These institutional involvement cluster types can also be displayed in tables or maps
### Macro-Planning ```{r generate macro planning data} mpn <- 100 Draw1to100 <- function(n, pwr) round(runif(n)^pwr*100, 0) planets <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Neptune", "Uranus") mp <- data.frame(ID = str_pad(round(runif(mpn)*1e7, 0), width = 7, side = "left", pad = "0"), AQ = paste0(sample(c(2017, 2018), mpn, replace = TRUE), "-Q", sample(1:4, mpn, replace = TRUE)), Region = factor(sample(planets, mpn, replace = TRUE), levels = planets), Sentiment = Draw1to100(mpn, 1.0), MH = Draw1to100(mpn, 2.0), DA = Draw1to100(mpn, 2.0), DV = Draw1to100(mpn, 2.0), H = Draw1to100(mpn, 2.0), E = Draw1to100(mpn, 0.5), M = Draw1to100(mpn, 1.0), SC = Draw1to100(mpn, 0.5)) %>% filter(!AQ %in% c("2018-Q3", "2018-Q4")) ``` ```{r place and format dynamic table} # Develop color scale for quantiles of all top # /!\ An alternative scheme might just be for 1-10, 11-20, etc as hard divisions topics <- c("MH", "DA", "DV", "H", "E", "M", "SC") topic_brks <- quantile(select(mp, one_of(topics)), probs = seq(.05, .95, .05), na.rm = TRUE) # Develop colors as increasing shades of blue topic_clrs <- round(seq(255, 40, length.out = length(topic_brks) + 1), 0) %>% paste0("rgb(", ., ",", ., ",255)") topicNames <- c("Mental Health", "Drug/Alcohol", "Domestic Violence", "Housing", "Education", "Medical", "Social Connection") sent_brks <- quantile(mp$Sentiment, probs = seq(.05, .95, .05), na.rm = TRUE) sent_clrs <- colorRampPalette(c("firebrick2", "forestgreen"))(length(sent_brks) + 1) colnames(mp) <- c("ID", "Assessment Quarter", "Region", "Sentiment", topicNames) rownames(mp) <- NULL datatable(mp, extensions = 'Buttons', filter = 'top', rownames = FALSE, options = list(dom = 'Bfrtip', buttons = list('copy', 'print', list( extend = 'collection', buttons = c('csv', 'excel', 'pdf'), text = 'Download')), pageLength = 20)) %>% formatStyle("Sentiment", backgroundColor = styleInterval(sent_brks, sent_clrs), color = "white") %>% formatStyle(topicNames, backgroundColor = styleInterval(topic_brks, topic_clrs)) ``` *** ### System-Level Topic Map ```{r display zipcode map with topic layers} # Get zipcode boundary data zips <- readOGR(dsn = "data", layer = "Allegheny_County_Zip_Code_Boundaries", verbose = FALSE) # Sample topics: topics <- c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income") sigma <- matrix(c(1.0, 0.2, 0.1, 0.2, 0.4, 0.2, 1.0, 0.3, 0.4, 0.5, 0.1, 0.3, 1.0, 0.4, 0.6, 0.2, 0.4, 0.4, 1.0, 0.4, 0.4, 0.5, 0.6, 0.4, 2.0), nrow = 5) mu <- c(0.8, 1.5, 0.0, -1.0, 1.0) # Draw correlated values set.seed(8062018) draws <- mvrnorm(n = nrow(zips@data), mu = mu, Sigma = sigma) # Put this on a 0-100 scale, label, and arrange in order (to give a pattern to matching with zips) draws <- round((draws - min(draws))/(max(draws) - min(draws))*100, 1) colnames(draws) <- topics draws <- arrange(as.data.frame(draws), -`Low Income`) # Attach values to zipcodes zips@data <- cbind(zips@data, draws) %>% within(topicPop <- paste0("Topic Scores for Zip ", ZIP, ":
", "Pregnant: ", sprintf("%1.1f", Pregnant), "
", "Domestic Violence: ", sprintf("%1.1f", `Domestic Violence`), "
", "Opiates: ", sprintf("%1.1f", Opiates), "
", "Suicide: ", sprintf("%1.1f", Suicide), "
", "Low Income: ", sprintf("%1.1f", `Low Income`))) palBin_blue <- colorBin(palette = "Blues", domain = c(0, 100), bins = 5) myOpacity <- 0.7 leaflet(data = zips) %>% # width = "100%" #addTiles() %>% addProviderTiles("CartoDB.Positron") %>% addPolygons(group = "Pregnant", fillColor = ~palBin_blue(Pregnant), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity) %>% addPolygons(group = "Domestic Violence", fillColor = ~palBin_blue(`Domestic Violence`), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity) %>% addPolygons(group = "Opiates", fillColor = ~palBin_blue(Opiates), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity) %>% addPolygons(group = "Suicide", fillColor = ~palBin_blue(Suicide), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity) %>% addPolygons(group = "Low Income", fillColor = ~palBin_blue(`Low Income`), popup = ~topicPop, weight = 2.0, fillOpacity = myOpacity) %>% addLayersControl(baseGroups = c("Pregnant", "Domestic Violence", "Opiates", "Suicide", "Low Income"), options = layersControlOptions(collapsed = FALSE)) %>% addLegend(position = "bottomleft", colors = unique(palBin_blue(0:100)), title = "Topic Scores", labels = c("0-20", "20-40", "40-60", "60-80", "80-100")) ``` *** * Dynamic maps offer panning, zooming, and "popup" annotations that allow exploration of geographic trends in various case topics and combinations thereof, such as "opiod" and "overdose" * This map shows an example of multiple topic layers for Alleghency county zip codes, with more information available through clicking